Diabetic Retinopathy (DR) is a leading cause of vision loss worldwide, and early DR detection is essential to prevent vision loss and support timely treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies DR grade, localizes lesion areas, and provides visual explanations; (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations using the Wasserstein distance and adversarial learning-based entropy minimization. In addition, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss-function constraints between lesion features and classification features, our approach remains robust under a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRiD and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly supervised manner.
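The abstract's domain-alignment idea can be illustrated with the empirical 1-D Wasserstein distance between two feature samples. The paper does not give its exact formulation, so this is only a minimal sketch; the function name and toy samples are illustrative:

```python
import numpy as np

def wasserstein_1d(u, v):
    """Empirical 1-Wasserstein distance between two equal-size 1-D
    samples: the mean absolute difference of their sorted values."""
    u, v = np.sort(np.asarray(u, float)), np.sort(np.asarray(v, float))
    return float(np.mean(np.abs(u - v)))

# Identical empirical distributions cost nothing to align...
print(wasserstein_1d([1, 2, 3], [3, 2, 1]))  # 0.0
# ...while a constant shift costs exactly that shift.
print(wasserstein_1d([0, 0, 0], [1, 1, 1]))  # 1.0
```

Minimizing such a cost between source and target feature distributions is one standard way to encourage invariant representations.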
Collecting large-scale medical datasets with fully annotated samples for training of deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the capability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions. In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL task based on a 2D contrastive clustering problem for distinct classes. The 3D volumes are exploited by computing a vectored embedding at each slice and then assembling a holistic feature through deformable self-attention mechanisms in a Transformer, allowing long-range dependencies between slices inside 3D volumes to be incorporated. These holistic features are further utilized to define a novel 3D clustering agreement-based SSL task and a masked embedding prediction task inspired by pre-trained language models. Experiments on downstream tasks, such as 3D brain segmentation, lung nodule detection, 3D heart structure segmentation, and abnormal chest X-ray detection, demonstrate the effectiveness of our joint 2D and 3D SSL approach. We improve plain 2D DeepClusterV2 and SwAV by a significant margin and also surpass various modern 2D and 3D SSL approaches.
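The slice-to-volume assembly can be pictured with plain self-attention over slice embeddings. The paper uses deformable attention inside a Transformer; this NumPy sketch substitutes standard scaled dot-product attention and mean pooling, so it is a simplified stand-in rather than the actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def holistic_volume_feature(slice_embs):
    """Assemble per-slice embeddings of shape (n_slices, d) into one
    volume-level feature: slice-to-slice scaled dot-product attention
    followed by mean pooling over slices."""
    E = np.asarray(slice_embs, float)
    d = E.shape[1]
    attn = softmax(E @ E.T / np.sqrt(d), axis=-1)  # (n_slices, n_slices)
    return (attn @ E).mean(axis=0)                 # (d,)

feat = holistic_volume_feature(np.random.randn(8, 16))
print(feat.shape)  # (16,)
```

The attention matrix lets every slice aggregate information from every other slice, which is what provides the long-range dependencies a per-slice 2D model lacks.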
Multilevel optimization has been widely adopted as a mathematical foundation for a myriad of machine learning problems, such as hyperparameter optimization, meta-learning, and reinforcement learning, to name a few. Nonetheless, implementing multilevel optimization programs often requires expertise in both mathematics and programming, which hinders research in the field. We take an initial step towards closing this gap by introducing Betty, a high-level software library for gradient-based multilevel optimization. To this end, we develop an automatic differentiation procedure based on a novel interpretation of multilevel optimization as a dataflow graph. We further abstract the main components of multilevel optimization as Python classes, to enable simple, modular, and maintainable programming. We empirically demonstrate that Betty can be used as a high-level programming interface for a range of multilevel optimization programs, while observing up to an 11% increase in test accuracy, a 14% decrease in GPU memory usage, and a 20% decrease in wall time over existing implementations on multiple benchmarks. The code is available at http://github.com/leopard-ai/betty.
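The kind of program Betty automates can be illustrated with a tiny bilevel problem, solved here from scratch rather than with Betty's API; the quadratic losses and the finite-difference outer gradient are purely illustrative:

```python
def inner_solve(lam, steps=200, lr=0.1):
    """Inner problem: minimize the 'training' loss (w - 1)^2 + lam * w^2
    by gradient descent; the minimizer is w*(lam) = 1 / (1 + lam)."""
    w = 0.0
    for _ in range(steps):
        w -= lr * (2 * (w - 1) + 2 * lam * w)
    return w

def val_loss(lam):
    # Outer objective: we want the fitted w*(lam) to land on 0.5.
    return (inner_solve(lam) - 0.5) ** 2

# Outer loop: finite-difference gradient descent on the hyperparameter.
lam, eps, lr = 0.0, 1e-4, 0.5
for _ in range(100):
    g = (val_loss(lam + eps) - val_loss(lam - eps)) / (2 * eps)
    lam -= lr * g
print(round(lam, 2))  # ≈ 1.0, since w*(1) = 0.5
```

Each outer step requires solving (or differentiating through) the inner problem; libraries like Betty replace the crude finite differences here with automatic differentiation over the dataflow graph of levels.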
Traditional machine learning (ML) relies heavily on manual design by machine learning experts to decide learning tasks, data, models, optimization algorithms, and evaluation metrics; it is labor-intensive and time-consuming, and cannot learn autonomously the way humans do. In education science, self-directed learning, where human learners select learning tasks and materials without requiring hands-on guidance, has been shown to be more effective than passive, teacher-guided learning. Inspired by the concept of self-directed human learning, we introduce the principal concept of self-directed machine learning (SDML) and propose a framework for SDML. Specifically, we design SDML as a self-directed learning process guided by self-awareness, comprising internal awareness and external awareness. Our proposed SDML process performs self task selection, self data selection, self model selection, self optimization-strategy selection, and self evaluation-metric selection under the guidance of self-awareness, without human guidance. Meanwhile, the learning performance of the SDML process serves as feedback to further improve self-awareness. We propose a mathematical formulation for SDML based on multilevel optimization. In addition, we present case studies together with potential applications of SDML, followed by a discussion of future research directions. We expect SDML to enable machines to conduct human-like self-directed learning and to provide a new perspective towards artificial general intelligence.
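A generic nested form of the multilevel optimization the abstract refers to, in which each level's decision depends on the optimum of the level below it, can be written as follows (this is a standard textbook formulation, not necessarily the paper's exact one):

```latex
\min_{\theta_1} \; F_1\!\left(\theta_1, \theta_2^{*}\right)
\qquad \text{s.t.} \qquad
\theta_k^{*} \in \operatorname*{arg\,min}_{\theta_k} \;
F_k\!\left(\theta_k, \theta_{k+1}^{*}\right),
\quad k = 2, \dots, K,
```

where the innermost level $K$ has no lower-level dependency. In the SDML setting, the levels would correspond to nested choices such as task, data, model, and optimization-strategy selection.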
Learning from mistakes is an effective learning method widely used in human learning: learners pay more attention to their mistakes so as to avoid repeating them in the future, which helps improve overall learning outcomes. In this work, we aim to investigate how effectively this particular learning ability can be used to improve machine learning models. We propose a simple and effective multi-level optimization framework called Learning From Mistakes (LFM), inspired by mistake-driven learning, to train better machine learning models. Our LFM framework consists of a formulation involving three learning stages. The primary objective is to train a model to perform the target task using a re-weighting technique that prevents similar mistakes in the future. In this formulation, we learn class weights by minimizing the validation loss of the model, and re-train the model with synthetic data from an image generator, weighted by class-wise performance, together with real data. We apply our LFM framework to differentiable architecture search methods on image classification datasets such as CIFAR and ImageNet, and the results demonstrate the effectiveness of our proposed strategy.
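The class re-weighting step can be sketched as up-weighting the classes with high validation error. The abstract does not give the exact weighting scheme, so the exponential form and temperature below are illustrative only:

```python
import numpy as np

def class_weights_from_mistakes(val_acc, temperature=1.0):
    """Up-weight classes the model gets wrong on the validation set.
    val_acc: per-class validation accuracy in [0, 1].
    Returns weights normalized to a mean of 1."""
    err = 1.0 - np.asarray(val_acc, float)
    w = np.exp(err / temperature)
    return w / w.sum() * len(w)

w = class_weights_from_mistakes([0.9, 0.5, 0.7])
print(np.round(w, 3))  # the weakest class (acc 0.5) gets the largest weight
```

Such weights would then multiply the per-class loss terms when re-training, so the model focuses on its past mistakes.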
In differentiable neural architecture search (NAS) algorithms such as DARTS, the training set used to update model weights and the validation set used to update the model architecture are sampled from the same data distribution. As a result, rare features in the dataset fail to receive sufficient attention during training. In this paper, rather than introducing a more complex NAS algorithm, we explore the idea that adding quality synthesized datasets into training can help the classification model identify its weaknesses and improve recognition accuracy. We introduce a training strategy called "Differentiable Architecture Search with a Generative Model (DASGM)." In DASGM, the training set is used to update the classification model weights, while a synthesized dataset is used to train its architecture. The generated images, which have a different distribution from the training set, can help the classification model learn better features and identify its weaknesses. We formulate DASGM as a multi-level optimization framework and develop an effective algorithm to solve it. Experiments on CIFAR-10, CIFAR-100, and ImageNet demonstrate the effectiveness of DASGM. The code will be made available.
Learning from one's mistakes is an effective human learning technique in which learners pay more attention to the topics where they made mistakes, so as to deepen their understanding. In this paper, we investigate whether this human learning strategy can be applied to machine learning. We propose a novel machine learning method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by paying more attention to mistakes during revision. We formulate LFM as a three-stage optimization problem: 1) the learner learns; 2) the learner re-learns, focusing on mistakes; and 3) the learner validates its learning. We develop an efficient algorithm to solve the LFM problem. We apply the LFM framework to neural architecture search on CIFAR-10, CIFAR-100, and ImageNet. Experimental results strongly demonstrate the effectiveness of our model.
Conditional image generation (CIG) is a widely studied problem in computer vision and machine learning. Given a class, CIG takes the name of this class as input and generates a set of images belonging to it. In existing CIG works, the images for different classes are generated independently, without considering the relationships among classes. In real-world applications, classes are organized into a hierarchy, and their hierarchical relationships are informative for generating high-fidelity images. In this paper, we aim to leverage the class hierarchy for conditional image generation. We propose two ways of incorporating the class hierarchy: prior control and post constraint. In prior control, we first encode the class hierarchy, then feed it as a prior into the conditional generator to generate images. In post constraint, after the images are generated, we measure their consistency with the class hierarchy and use the consistency score to guide the training of the generator. Based on these two ideas, we propose a TreeGAN model consisting of three modules: (1) a class hierarchy encoder (CHE), which takes the hierarchical structure of classes and their textual names as inputs and learns an embedding for each class; the embeddings capture the hierarchical relationships among classes; (2) a conditional image generator (CIG), which takes the CHE-generated embedding of a class as input and generates a set of images belonging to this class; (3) a consistency checker, which performs hierarchical classification on the generated images and checks whether they are compatible with the class hierarchy; the consistency score is used to guide the CIG to generate hierarchy-compatible images. Experiments on various datasets demonstrate the effectiveness of our method.
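The post-constraint idea scores generated images by their agreement with the class hierarchy. The paper's checker is a learned hierarchical classifier; the toy scoring rule, parent map, and class names below are purely illustrative:

```python
def hierarchy_consistent(pred, target, parent):
    """A predicted label agrees with the target class if it is the
    target itself or a sibling under the same parent node."""
    return pred == target or parent.get(pred) == parent.get(target)

def consistency_score(preds, target, parent):
    """Fraction of generated samples whose predicted class is
    hierarchy-consistent with the conditioning class."""
    return sum(hierarchy_consistent(p, target, parent) for p in preds) / len(preds)

parent = {"husky": "dog", "beagle": "dog", "tabby": "cat"}
score = consistency_score(["husky", "beagle", "tabby"], "husky", parent)
print(round(score, 3))  # 0.667: the 'tabby' sample breaks consistency
```

A score like this, made differentiable via classifier probabilities, could be fed back as a training signal for the generator.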
This study presents a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that can handle data from different modalities, including unstructured text, semi-structured text, and structured tabular data. We further employed an ensemble method to integrate all modality-specific models to generate the ICD-10 codes. Key evidence was also extracted to make our predictions more convincing and explainable. We used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset to validate our approach. For ICD code prediction, our best-performing model (micro-F1 = 0.7633, micro-AUC = 0.9541) significantly outperformed other baseline models, including TF-IDF (micro-F1 = 0.6721, micro-AUC = 0.7879) and Text-CNN (micro-F1 = 0.6569, micro-AUC = 0.9235). For explainability, our approach achieved a Jaccard Similarity Coefficient (JSC) of 0.1806 on text data and 0.3105 on tabular data, while well-trained physicians achieved 0.2780 and 0.5002, respectively.
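Micro-averaged F1, the headline metric above, pools counts across all samples and codes before computing precision and recall. A small self-contained sketch (the code sets are toy examples, not MIMIC-III data):

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over multi-label code sets: pool true/false
    positives and false negatives across all samples first."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

gold = [{"E11.9", "I10"}, {"J18.9"}]   # true ICD-10 code sets per stay
pred = [{"E11.9"}, {"J18.9", "I10"}]   # predicted code sets
print(round(micro_f1(gold, pred), 3))  # 0.667
```

Micro averaging weights frequent codes more heavily than macro averaging, which is why it is the standard choice for the highly imbalanced ICD label space.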
Driven by improved architectures and better representation learning frameworks, the field of visual recognition has enjoyed rapid modernization and performance boost in the early 2020s. For example, modern ConvNets, represented by ConvNeXt, have demonstrated strong performance in various scenarios. While these models were originally designed for supervised learning with ImageNet labels, they can also potentially benefit from self-supervised learning techniques such as masked autoencoders (MAE). However, we found that simply combining these two approaches leads to subpar performance. In this paper, we propose a fully convolutional masked autoencoder framework and a new Global Response Normalization (GRN) layer that can be added to the ConvNeXt architecture to enhance inter-channel feature competition. This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks, including ImageNet classification, COCO detection, and ADE20K segmentation. We also provide pre-trained ConvNeXt V2 models of various sizes, ranging from an efficient 3.7M-parameter Atto model with 76.7% top-1 accuracy on ImageNet, to a 650M Huge model that achieves a state-of-the-art 88.9% accuracy using only public training data.
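The GRN layer is described as aggregating a global response, normalizing it across channels, and calibrating the input. A minimal NumPy sketch of that recipe, per sample and channels-last, with the learnable per-channel parameters simplified to scalars, might look like:

```python
import numpy as np

def grn(x, gamma=1.0, beta=0.0, eps=1e-6):
    """Global Response Normalization on a channels-last (H, W, C) map:
    1) aggregate a global L2 response per channel,
    2) normalize it by the mean response across channels,
    3) calibrate the input, with a residual connection."""
    g = np.linalg.norm(x, axis=(0, 1))   # (C,) per-channel global response
    n = g / (g.mean() + eps)             # divisive cross-channel normalization
    return gamma * (x * n) + beta + x

out = grn(np.random.randn(4, 4, 8))
print(out.shape)  # (4, 4, 8)
```

Channels with above-average global response get amplified and the rest are suppressed, which is the inter-channel feature competition the abstract refers to.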